To survey researchers who had published on COVID-19, we extracted a list of corresponding author email addresses from Web of Science, drawing from all research articles, editorials, and review papers that mentioned COVID as a keyword. We then limited the pool to 35,830 email addresses that appeared more than once (indicating multiple publications) and emailed a random selection of 9955 researchers. This was the limit set by our institutional subscription to Alchemer, the survey platform we used. Recruitment and reminder emails, and a printout of the survey from Alchemer, are available at osf.io/3bn9u.
Because of the sensitive nature of the subject, Science applied for ethical review of the survey through BRANY (Biomedical Research Alliance of New York), an independent review board. The IRB protocol (BRANY SBER IRB #: 21-141-972) is available at osf.io/3bn9u.
Cathleen O’Grady, a contributing correspondent for Science, designed and conducted the survey, analyzed the data and published a story about it in the news section of Science magazine, March 25, 2022. Meta-scientist Tim Errington advised on the IRB process, survey methods, and statistical analysis. Martin Enserink, International News Editor at Science, provided editorial input.
Of the 9955 emails sent, 9585 were delivered, with bounced emails presumably due to researchers changing institutions. We received 511 responses and removed one suspect response that had identical answers across multiple questions. A similar survey was sent to 59,653 members of the American Association for the Advancement of Science, Science’s publisher, representing a wide range of disciplines; 1281 responded. To allow participants to share only the information they were comfortable sharing, and in line with the IRB-approved consent procedure, all questions were optional. This means that many questions were not answered by all participants and therefore have some “NA” (empty) responses.
Below, we show summary results for each question in the survey. As with any survey, it is very likely that certain kinds of people responded more than others. Targets of abuse may have been more likely to respond; to limit this effect, our recruitment email specifically asked people with no experience of harassment to fill in the survey. However, the skew could also operate in the opposite direction: People who have experienced harassment could be more worried about anonymity or less willing to discuss their experiences. Either way, the fact that respondents self-selected into filling in the survey means that the results cannot be taken to represent all COVID-19 scientists, or scientists more generally. These results also rely on self-report and on individuals’ interpretations of our questions, which further limit the reliability of the data, as with all survey research. Results should be interpreted with caution.
Because of the number of different questions, and because we did not have specific hypotheses about our results, we have treated this as an exploratory dataset. We have not used significance testing, which could lend undue weight to tentative findings; instead, we calculated only descriptive statistics and effect sizes. For a discussion of the inappropriate use of null hypothesis significance testing, see: Szucs, D., & Ioannidis, J. (2017). When null hypothesis significance testing is unsuitable for research: A reassessment. Frontiers in Human Neuroscience, 11, 390.
For the sake of transparency, and to allow further analysis of the dataset, data and analysis code are available at osf.io/3bn9u. However, given the sensitive nature of the subject, the public dataset has been anonymized, removing all potentially identifying details (including answers related to location, discipline, and demographics). We are open to collaboration in further analyzing the data, but access to the full dataset will be dependent on additional ethical review. Please contact us at intimidationsurvey@cathleenogrady.com if you are interested in this.
All rows without consent were removed during data cleaning.
Participants could select multiple discipline options. Free-text “other” responses covered a wide range of disciplines, including ethics, history of medicine, nutrition, and aerosol physics.
Free-text responses included industry advice, speaking to local community groups, writing opinion pieces, and being mentioned in (but not interviewed for) news articles about research.
To gauge each respondent’s overall level of exposure, we created a composite “exposure intensity” score that assigned a point for each separate publicity venue, multiplied by the midpoint of the reported frequency category. For instance, someone who said they had been on TV news once would get one point; if they said this had happened twice, it would be 2 points; if they said 3-5 times, it would be 4 points (the midpoint of the 3-5 range). If they said they had also been on radio news twice, this would give them an additional two points, for a total score of 6.
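The scoring scheme can be sketched as below. This is an illustrative Python reconstruction, not the published analysis code (available at osf.io/3bn9u); the exact frequency-category labels and the value assigned to the open-ended top category are assumptions.

```python
# Midpoints for each frequency category. The category labels and the
# value for the open-ended top category ("more than 50 times") are
# assumptions for illustration; see osf.io/3bn9u for the real code.
FREQ_MIDPOINTS = {
    "once": 1,
    "twice": 2,
    "3-5 times": 4,            # midpoint of the 3-5 range
    "6-10 times": 8,
    "11-50 times": 30,
    "more than 50 times": 75,  # assumed value for the open-ended category
}

def exposure_intensity(venue_frequencies):
    """Sum frequency-category midpoints across all publicity venues.

    venue_frequencies maps each venue (e.g. "TV news") to the frequency
    category the respondent selected, or None if it was not selected.
    """
    return sum(FREQ_MIDPOINTS[freq]
               for freq in venue_frequencies.values()
               if freq is not None)
```

For the worked example in the text, TV news 3-5 times (4 points) plus radio news twice (2 points) yields a score of 6.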
Of the 510 respondents, 104 reported no exposure at all, resulting in a score of 0. The number of respondents declined steadily as exposure scores increased.
Participants could select multiple options.
Free text “other” responses included ethics, environmental health, diagnostic testing, and aerosol physics.
Free text “other” responses included decarceration, masking in schools, school closure, contact tracing, and mental health.
Participants were asked a series of questions about whether they had experienced a range of harassment types as a result of their work on COVID-19. If a participant responded “yes” to a particular kind of harassment, they were then asked about venue, onset, and frequency of this type of harassment, before proceeding to the next question. This means that responses to questions about onset, frequency and venue are from a subset of the total number of participants (only those who responded “yes” to the relevant harassment question).
Each participant was assigned a “score” reflecting how many types of harassment they reported. For instance, a participant reporting only one kind of harassment would receive a score of 1; someone reporting eight kinds of harassment would receive a score of 8. Out of 510 respondents, 315 (61.8%) reported no harassment, and 195 (38.2%) reported at least one kind of harassment. Some participants reported experiencing multiple categories of harassment.
| Harassment score | Number of respondents |
|---|---|
| 0 | 315 |
| 1 | 57 |
| 2 | 42 |
| 3 | 28 |
| 4 | 16 |
| 5 | 14 |
| 6 | 13 |
| 7 | 10 |
| 8 | 2 |
| 9 | 2 |
| 10 | 3 |
| 11 | 2 |
| 12 | 4 |
| 13 | 1 |
| 14 | 1 |
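A tabulation of this kind can be produced with a few lines of pandas. This is an illustrative sketch on toy data with hypothetical column names, not the published analysis code (available at osf.io/3bn9u).

```python
import pandas as pd

# Toy data with hypothetical column names: one yes/no column per
# harassment type (the survey covered 14 types, as scores up to 14 show).
df = pd.DataFrame({
    "personal_insults": ["yes", "no", "yes"],
    "death_threats":    ["no",  "no", "yes"],
})
harassment_cols = ["personal_insults", "death_threats"]

# One point per harassment type reported.
df["harassment_score"] = (df[harassment_cols] == "yes").sum(axis=1)

# Count respondents at each score, as in the table above.
print(df["harassment_score"].value_counts().sort_index())
```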
This survey found a much lower rate of harassment than a survey conducted by Nature, which found that 81% of 321 scientists surveyed reported at least occasional abuse. However, the survey samples differed substantially: Nature surveyed researchers on the COVID-19 media contact lists at science media centers in a range of countries, as well as researchers who had been prominent in media coverage. In contrast, our survey contacted researchers who had published on COVID-19, regardless of media coverage. This means that the sample surveyed by Nature likely had far higher rates of public attention, which may explain the much higher rate of abuse.
In order to compare samples more closely, at the request of Nature, we subsetted the data to include only those participants who reported at least one media interview (TV, radio, print, online, or other). 310 respondents reported at least one interview, and of these 310 respondents, 161 (52%) reported at least one type of harassment. The fact that this number is lower than the figure found by Nature may be at least partially explained by a difference in prominence: Researchers speaking frequently to science media centers likely have a higher profile than the researchers even in this subset of our sample, who may have given just one interview since the start of the pandemic.
Participants could select multiple categories. Personal insults and attacks on credibility or honesty were the most common, with protests, unwanted visits and physical intimidation much rarer.
Free text responses mentioned emails directly to employers.
Most respondents reported that the harassment began soon after the pandemic began, or within the last year. Onset of harassment within the past six months was much less common.
Some respondents reported frequent personal insults, attacks on credibility and honesty, and excessive contact from swarms or individuals. Some reported that even severe abuse, including doxxing and wishes of harm or death, had occurred more than 50 times.
To gauge each respondent’s overall level of harassment, we created a composite “harassment intensity” score that assigned a point for each separate kind of harassment, multiplied by the midpoint of the reported frequency category. For instance, someone who said they had received personal insults once would receive one point; if they said this had happened twice, it would be 2 points; if they said 3-5 times, it would be 4 points (the midpoint of the 3-5 range). If they said they had also received death threats twice, that would give an additional two points, for a total score of 6.
Because these scores are based on the simple harassment scores described above, the same numbers apply: Out of 510 respondents, 315 (61.8%) have harassment intensity scores of zero, and 195 (38.2%) have scores of at least one. Those participants who reported multiple categories of harassment at very high frequencies have scores in the hundreds.
We calculated the association between harassment intensity and exposure intensity using Spearman’s rank correlation. There was a positive association between the variables, ρ = 0.56, 95% CI [0.50, 0.62]. This is conventionally classified as a “large” effect size, but it is crucial to note that there could be different explanations for this relationship: For instance, publicity may lead directly to more harassment, but scientists with greater prominence may receive both more harassment and more interview invitations.
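For reference, a calculation of this kind can be sketched as below. This is an illustrative Python version that uses a Fisher-z approximation for the confidence interval; the published analysis (osf.io/3bn9u) may compute the interval differently (e.g. by bootstrap).

```python
import numpy as np
from scipy import stats

def spearman_with_ci(x, y, alpha=0.05):
    """Spearman rank correlation with an approximate Fisher-z CI.

    Uses the common normal approximation with the Spearman-adjusted
    standard error 1.03 / sqrt(n - 3).
    """
    rho, _ = stats.spearmanr(x, y)
    n = len(x)
    z = np.arctanh(rho)                   # Fisher transformation
    se = 1.03 / np.sqrt(n - 3)
    zcrit = stats.norm.ppf(1 - alpha / 2)
    lower, upper = np.tanh(z - zcrit * se), np.tanh(z + zcrit * se)
    return rho, lower, upper
```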
We calculated the association between each publicity topic and harassment intensity using point-biserial correlations. Here we report the correlation coefficient and 95% CI for each topic. As with any correlation, there could be multiple explanations for the appearance of a relationship. Fewer than 20 respondents reported advocating for some topics (not locking down, not restricting travel, not wearing masks, and other topics further down in the table), meaning that these results should be interpreted with extra caution.
| Topic | Lower 95% CI | Point-biserial r | Upper 95% CI |
|---|---|---|---|
| Not treating with ivermectin | 0.2570934 | 0.3364098 | 0.4112242 |
| Natural origin of the virus being most likely | 0.2408022 | 0.3209184 | 0.3966908 |
| Vaccine passports or mandates | 0.2292160 | 0.3098748 | 0.3863070 |
| Wearing masks | 0.2146939 | 0.2960020 | 0.3732355 |
| Vaccination | 0.2099224 | 0.2914364 | 0.3689269 |
| Travel restrictions | 0.1973455 | 0.2793842 | 0.3575370 |
| Not treating with hydroxychloroquine | 0.1936352 | 0.2758238 | 0.3541678 |
| Lockdown | 0.1793888 | 0.2621319 | 0.3411923 |
| Hand-washing and hygiene | 0.1697844 | 0.2528823 | 0.3324096 |
| Ventilation | 0.1093000 | 0.1942821 | 0.2764448 |
| Not locking down | 0.1077453 | 0.1927679 | 0.2749911 |
| Not restricting travel | 0.0754311 | 0.1612012 | 0.2446035 |
| Not wearing masks | 0.0359045 | 0.1223490 | 0.2069761 |
| Not vaccinating certain groups (e.g. children) | -0.0185052 | 0.0684306 | 0.1543395 |
| Refusing vaccination | -0.0459133 | 0.0410763 | 0.1274477 |
| Treating with ivermectin | -0.0519541 | 0.0350298 | 0.1214862 |
| Lab origin of the virus being most likely | -0.0776761 | 0.0092119 | 0.0959610 |
| Treating with hydroxychloroquine | -0.1019538 | -0.0152630 | 0.0716578 |
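Per-topic correlations of this kind can be computed as sketched below: a 0/1 indicator (advocated publicly for the topic or not) is correlated with the continuous harassment intensity score. This is an illustrative Python version using a Fisher-z interval, not the published analysis code (osf.io/3bn9u), which may use a different CI method.

```python
import numpy as np
from scipy import stats

def point_biserial_with_ci(indicator, scores, alpha=0.05):
    """Point-biserial correlation between a 0/1 topic indicator and a
    continuous score, with an approximate Fisher-z confidence interval.
    Returns (lower CI, r, upper CI), matching the table layout.
    """
    r, _ = stats.pointbiserialr(indicator, scores)
    n = len(indicator)
    z = np.arctanh(r)                     # Fisher transformation
    se = 1 / np.sqrt(n - 3)
    zcrit = stats.norm.ppf(1 - alpha / 2)
    return np.tanh(z - zcrit * se), r, np.tanh(z + zcrit * se)
```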
Only participants who reported at least one category of harassment (n = 195) were asked questions in this section. Participants who reported no harassment moved directly to demographic questions.
Participants were asked four separate questions, one for each kind of support. If they said they had received a particular kind of support, they were then asked about their level of satisfaction with that support.
Very few respondents reported receiving support; however, of those who did, the majority said they were satisfied or extremely satisfied with the help they received.
Free text responses included mentions of emotional support, including institutions making it clear that a researcher’s public engagement activity was valuable. But many other responses noted that no support of any kind had been available, and some included mention of universities instructing respondents to stop speaking publicly.
Many respondents indicated that they wanted emotional support from employers: recognition that their public engagement work is valuable, and enquiries regarding wellbeing.
Some responses noted that abuse had come from other academics, or noted other topics that had resulted in harassment, including vaccine side effects, ME/CFS, and the pandemic’s intersection with racial inequality. One respondent noted that their institution had refused to delist contact details on the university website. Another noted that a supportive PhD advisor had greatly increased their feelings of safety.
One respondent noted that witnessing other researchers’ experiences of abuse had made them less likely to participate in public communication. Another noted that they had ended a vaccine information project on Facebook due to abuse. And one noted that they intended to leave the field due to the abuse they had experienced.
In order to avoid excluding any particular group, this was offered as a free text response. Given the responses to the following questions indicating that the survey population was predominantly ethnic majority/other groups who do not experience prejudice, these results were not analyzed.
This question was intended to capture a global range of minority identities.
In some countries, an ethnic majority may still experience systemic disadvantage (for example, Black people in South Africa). This question was designed to separate the experience of systemic disadvantage from the question of majority/minority status.
Many free text responses noted the effects of sexism in science. Others noted ageism, religious discrimination, and racism.
No associations were found between demographics and harassment. However, these results should be interpreted with extreme caution. As is clear from the demographic analyses above, there were very few racial, cultural, or sexual minorities in the survey sample, meaning that any real effect may not have been detectable in this sample. There may also be multiple explanations for a lack of association: For instance, people who feel particularly at risk of abuse may self-select out of public attention at a higher rate, and thereby avoid more extreme harassment.
| Variable | Lower 95% CI | Point-biserial r | Upper 95% CI |
|---|---|---|---|
| Disability | -0.0178511 | 0.0760809 | 0.1686818 |
| Prejudice | -0.0723671 | 0.0222409 | 0.1164523 |
| Gender | -0.0818527 | 0.0122619 | 0.1061599 |
| Sector | -0.1047367 | -0.0071239 | 0.0906248 |
| Minority | -0.1046995 | -0.0105664 | 0.0837544 |
| Career stage | -0.1050066 | -0.0110959 | 0.0830110 |
| Sexual orientation | -0.1354216 | -0.0414964 | 0.0531680 |
| Age | -0.1549016 | -0.0626390 | 0.0307069 |
Science will donate:
$193.75 to COVID-19 Solidarity Response Fund
$162.50 to GiveDirectly
$143.75 to Malaria Consortium
With thanks to all participants.